In this report, I will be focusing on comparing publisher information, checkouts per month, and distribution type from 2018 to 2023.
I will be using the Seattle Public Library’s checkouts data to answer the following questions:
How has the number of checkouts by material type changed over time?
How do the top 5 publishers compare in terms of checkouts over time?
How do the top 5 publishers by checkouts compare in terms of the distribution of material types?
What are the most checked-out items from the top publishers?
In my analysis of the Seattle Public Library’s checkouts data, I found that the number of checkouts for CDs and DVDs has been steadily decreasing since 2018. The number of checkouts for books has fluctuated heavily since 2018. The number of checkouts for audiobooks and eBooks has been steadily increasing since 2018, with audiobooks being the most checked-out material type of 2023 so far.
The most checked-out item from 2018 to 2023 was Educated: A Memoir, with 17,817 checkouts.
The top 5 publishers by checkouts are:
The top item for each of the top 5 publishers is:
The dataset is collected and published by the Seattle Public Library. The dataset includes items that were checked out from the Seattle Public Library more than 5 times from 2018 to 2023. The dataset includes the following metadata: Material Type, Checkout Month/Year, Subject, Publisher, Publishing Year, Book Title, and ISBN. The data was collected by the Seattle Public Library’s circulation department. It was collected to help the Seattle Public Library understand the types of materials that are being checked out by the public, as well as keep a record of what books are currently checked out. It includes 12 features and 816,354 observations.
We might need to consider the ethical questions of using a dataset that excludes certain types of materials. For example, the dataset excludes items checked out fewer than 5 times, which could be a large portion of the library’s collection. Additionally, we may need to consider what types of content are banned from libraries and why.
The data in this dataset is limited to the Seattle Public Library’s collection, so it is representative of only some of the Seattle population. The exclusion of items checked out fewer than 5 times is a further limitation, since those items could make up a large portion of the library’s collection. Additionally, the data is not uniform and needs trimming to represent publisher information accurately. The subject category was hard to narrow down because publishers have multiple branches and the encoding is inconsistent.
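To illustrate the kind of trimming that publisher names may need, here is a minimal sketch in Python. It is not necessarily the cleanup used for this report: the function name `normalize_publisher` and the suffix list are hypothetical, and the only idea it shows is collapsing spelling variants (case, punctuation, trailing "Inc.") into one key.

```python
import re

# Hypothetical list of corporate suffixes to drop; extend as needed.
_SUFFIXES = {"inc", "llc", "ltd", "co", "corp"}

def normalize_publisher(name: str) -> str:
    """Collapse spelling variants of a publisher name into one key."""
    # Lowercase, then turn every run of non-alphanumeric characters into a space.
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower())
    words = cleaned.split()
    # Drop trailing corporate suffixes such as "inc" or "llc".
    while words and words[-1] in _SUFFIXES:
        words.pop()
    return " ".join(words)

# Two spellings of the same publisher map to the same key.
print(normalize_publisher("Books on Tape, Inc."))
print(normalize_publisher("BOOKS ON TAPE INC"))
```

A key like this only merges variants that differ in case, punctuation, or suffix. Separate branches or imprints of one publisher would still need a hand-built mapping.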
This chart graphs checkouts from the top 5 material types by average checkouts from 2018-2023. The chart shows that the number of checkouts for CDs and DVDs has been steadily decreasing since 2018. The number of checkouts for books has been heavily fluctuating since 2018. The number of checkouts for audiobooks and eBooks has been steadily increasing since 2018, with audiobooks being the most checked-out material type of 2023 so far. I made this a line graph because I wanted to capture the change in checkouts by material type during COVID-19, and a line graph allowed me to easily visualize the trend.
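The aggregation behind a chart like this can be sketched as follows. This is an illustration under stated assumptions, not the code used for the report: the column names `MaterialType`, `CheckoutYear`, `CheckoutMonth`, and `Checkouts` are assumed, "average checkouts" is taken to mean the mean of the monthly totals, and the sample rows are made up.

```python
from collections import defaultdict

def monthly_totals_by_type(rows):
    """Sum checkouts per (material type, year, month)."""
    totals = defaultdict(int)
    for row in rows:
        key = (row["MaterialType"], row["CheckoutYear"], row["CheckoutMonth"])
        totals[key] += row["Checkouts"]
    return dict(totals)

def top_types_by_average(totals, n=5):
    """Rank material types by their mean monthly total and keep the top n."""
    per_type = defaultdict(list)
    for (material, _year, _month), count in totals.items():
        per_type[material].append(count)
    averages = {m: sum(v) / len(v) for m, v in per_type.items()}
    return sorted(averages, key=averages.get, reverse=True)[:n]

# Made-up sample rows, for illustration only.
sample = [
    {"MaterialType": "EBOOK", "CheckoutYear": 2020, "CheckoutMonth": 4, "Checkouts": 30},
    {"MaterialType": "EBOOK", "CheckoutYear": 2020, "CheckoutMonth": 4, "Checkouts": 20},
    {"MaterialType": "BOOK", "CheckoutYear": 2020, "CheckoutMonth": 4, "Checkouts": 10},
    {"MaterialType": "BOOK", "CheckoutYear": 2020, "CheckoutMonth": 5, "Checkouts": 40},
]
totals = monthly_totals_by_type(sample)
print(top_types_by_average(totals, n=1))  # ['EBOOK']
```

Each (material type, year, month) total becomes one point on the line for that material type.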
This chart visualizes total checkouts by month for the top 5 publishers by overall checkouts. I included this because a line graph allows us to easily visualize the popularity trend of each of the top 5 publishers over the last five years. The chart shows that the number of checkouts for each of the top 5 publishers has been steadily increasing since 2018. Books on Tape, Inc. has been the most popular publisher since 2018, with its average monthly checkouts increasing by 25,000 from 2018 to 2023. Penguin Group (USA), Inc. began as the most popular publisher in 2018 and had an extreme spike in late 2020. However, its checkout numbers then fell drastically, and it is now the third most popular publisher.
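A minimal sketch of how the top publishers and a change in average monthly checkouts could be computed is shown below. The column names (`Publisher`, `CheckoutYear`, `CheckoutMonth`, `Checkouts`), the function names, and the sample rows are all assumptions for illustration. The figures it prints are not results from the dataset.

```python
from collections import Counter, defaultdict

def top_publishers(rows, n=5):
    """Return the n publishers with the most total checkouts."""
    totals = Counter()
    for row in rows:
        totals[row["Publisher"]] += row["Checkouts"]
    return [publisher for publisher, _count in totals.most_common(n)]

def average_monthly_checkouts(rows, publisher, year):
    """Mean of the monthly totals for one publisher in one year.

    Only months that appear in the data are counted.
    """
    by_month = defaultdict(int)
    for row in rows:
        if row["Publisher"] == publisher and row["CheckoutYear"] == year:
            by_month[row["CheckoutMonth"]] += row["Checkouts"]
    if not by_month:
        return 0.0
    return sum(by_month.values()) / len(by_month)

# Made-up sample rows, for illustration only.
sample = [
    {"Publisher": "Publisher A", "CheckoutYear": 2018, "CheckoutMonth": 1, "Checkouts": 100},
    {"Publisher": "Publisher A", "CheckoutYear": 2018, "CheckoutMonth": 2, "Checkouts": 200},
    {"Publisher": "Publisher A", "CheckoutYear": 2023, "CheckoutMonth": 1, "Checkouts": 400},
    {"Publisher": "Publisher B", "CheckoutYear": 2018, "CheckoutMonth": 1, "Checkouts": 50},
]
change = (average_monthly_checkouts(sample, "Publisher A", 2023)
          - average_monthly_checkouts(sample, "Publisher A", 2018))
print(top_publishers(sample, n=1), change)  # ['Publisher A'] 250.0
```

Comparing yearly averages of monthly totals, rather than single months, keeps one unusual month from dominating the comparison.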
Finally, I made a faceted scatterplot that plots checkout year against the number of checkouts for each publisher. The color of the points represents the material type of the title. I chose a scatterplot because I thought it would be interesting to see the relationship between the checkout year, the number of checkouts, and the material type. I felt that a scatterplot would show how the three variables relate and would highlight anomalies and important outliers in the data.
Books on Tape publishes only audiobooks, so it is not surprising that all of its checkouts are audiobooks. It is interesting that no two publishers spiked in the same year. Most notably, eBook and audiobook titles have been spiking since COVID, and the number of checkouts for certain titles in these formats is thousands of checkouts above any other format during this time period.
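The data behind a scatterplot like this, and one way to locate each publisher's spike, can be sketched as follows. This is only an illustration: the column names are assumed, the sample rows are made up, and a "spike" is defined here (as an assumption) as the year containing the publisher's largest single-title yearly total.

```python
from collections import defaultdict

def yearly_title_totals(rows):
    """Sum checkouts per (publisher, title, material type, year)."""
    totals = defaultdict(int)
    for row in rows:
        key = (row["Publisher"], row["Title"], row["MaterialType"], row["CheckoutYear"])
        totals[key] += row["Checkouts"]
    return dict(totals)

def peak_year_by_publisher(totals):
    """For each publisher, return the year of its largest single-title total."""
    best = {}  # publisher -> (year, count)
    for (publisher, _title, _material, year), count in totals.items():
        if publisher not in best or count > best[publisher][1]:
            best[publisher] = (year, count)
    return {publisher: year for publisher, (year, _count) in best.items()}

# Made-up sample rows, for illustration only.
sample = [
    {"Publisher": "Publisher A", "Title": "Title 1", "MaterialType": "AUDIOBOOK",
     "CheckoutYear": 2020, "Checkouts": 100},
    {"Publisher": "Publisher A", "Title": "Title 1", "MaterialType": "AUDIOBOOK",
     "CheckoutYear": 2020, "Checkouts": 50},
    {"Publisher": "Publisher A", "Title": "Title 2", "MaterialType": "EBOOK",
     "CheckoutYear": 2021, "Checkouts": 120},
    {"Publisher": "Publisher B", "Title": "Title 3", "MaterialType": "BOOK",
     "CheckoutYear": 2019, "Checkouts": 80},
]
totals = yearly_title_totals(sample)
print(peak_year_by_publisher(totals))
```

Each entry in `totals` corresponds to one point in the scatterplot: the publisher selects the facet, the year and total give the position, and the material type gives the color.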